The present invention relates to the field of document management, and more
particularly, to a system for providing document management for the organization, 
handling, and retention of personal documents. 

BACKGROUND 



Financial documents commonly found in the home such as bills, invoices,
receipts, bank and brokerage statements, and tax records tend to pile up over time and
become difficult to manage effectively. The volume of paper itself becomes a storage
problem, and if the person's house is destroyed in a disaster such as an earthquake or
fire, the owner of the documents can lose valuable records and financial information.

Existing approaches for managing personal documents include Internet banking
and bill payment web sites, spreadsheet programs, and software for organizing
electronic documents. One problem with existing approaches is that they do not relate
the different types of documents to each other. Existing systems also do not provide a
centralized repository for storing and managing documents. Accordingly, there is a
need in the art for a personal document management system that provides centralized
organization, handling, and retention capabilities.
SUMMARY

The present invention provides a document management system with
centralized organization, handling, and retention capabilities. Documents in paper and
electronic format are loaded into the system, important information is
extracted, and the documents are handled appropriately based on knowledge about the
information flow between documents and the transactions that are associated with each
document. Document-specific handling procedures use the extracted information to
relate documents to each other and to provide information about activities related to the
documents, for example, bill payment. Decisions about when to keep and when to
discard documents are made in order to determine which documents should be backed
up to a secondary or remote location for later retrieval. Documents that are to be kept
are transferred via the system's centralized backup capability to a location outside the
home or office, allowing for retrieval if the original documents are destroyed.

BRIEF DESCRIPTION OF THE DRAWINGS

A more complete appreciation of the invention and many of its attendant 
advantages will be readily obtained and understood by referring to the following detailed 
description and the accompanying drawings in which like reference numerals denote 
like elements as between the various drawings. The drawings are briefly described 
below.
FIG. 1 is a block diagram illustrating a personal document management system
in an embodiment of the present invention. 

FIG. 2 is a flowchart illustrating steps that are performed in a method for 
managing personal documents in an embodiment of the present invention. 

DETAILED DESCRIPTION

An embodiment of the present invention provides a system for personal 
document management. The system of the present invention performs methods that 
may be implemented on a computer system having a computer-readable medium and
may be performed using computer-executable instructions. The computer-executable
instructions may be included in a computer program product. The methods may also
include transferring a computer program product from one or more first computers to
one or more second computers through a communications medium.

FIG. 1 is a block diagram 100 illustrating a document management system in an
embodiment of the present invention. Such a document management system would be
useful for personal documents or in a small office/home office (SOHO) environment.
Document management system 100 includes a local computer system 102, such as a
personal computer, and a remote computer system 104, such as that provided by an
off-site Internet service provider. Local computer system 102 and remote computer system
104 are connected to each other through a communications network 106. The local
computer system 102 includes a processor 108 having local storage 110. Storage 110
includes an operating environment 112 and software 114 configured to provide 
document handling according to an embodiment of the present invention. Local
computer system 102 also includes a scanner 116 and other input devices 118, such as
a keyboard or a computer-readable medium. Paper documents 120 are input to the
system via the scanner 116, which converts the documents to electronic format and
loads them into the storage 110. Electronic documents (not shown) may be loaded into
storage 110 without being scanned, via input 118 to processor 108. Local computer
system 102 also includes an output device 122, for example a printer or a display
device, that is connected to processor 108.

Remote computer system 104 may be used for securely backing up documents
to be retained. The documents to be retained are organized for effective use and are
securely backed up to remote computer system 104 using a communications network
106, such as the Internet. Remote computer system 104 includes a processor 124 that
is connected to a remote storage device 126. One reason for backing up the
documents to another location such as remote computer system 104 is to provide the
ability to retrieve the information contained in the documents in case any of the original
documents or the documents loaded onto computer system 102 are lost or destroyed.

FIG. 2 is a flowchart 200 illustrating an example of steps that may be performed 
in a method for managing personal documents in an embodiment of the present 
invention. The method begins, step 202, and a document is loaded, step 204. The 
document may be loaded by being retrieved electronically (for example, from a disk or from
the Internet) or the document may be loaded by being scanned in from a paper 
document. The document has a format and a category associated with it, each 
described further below. The format indicates whether the document is in paper or
electronic form. The category relates to the function of the document. For example, 
some document categories include bills, invoices, receipts, bank and brokerage 
statements, tax returns, product warranties and checks. Categories such as bills and 
receipts may be divided up into subcategories such as credit card, utility, mortgage, or 
insurance. Miscellaneous categories may also be set up, step 212 described further 
below, so that the user may define categories that are not already defined in the 
system, for example documents coming from an external source such as business trip 
receipts. 

Once the document is loaded, step 204, the document is then characterized, 
step 206, to determine the document category. Categorizing a document may be done 
in numerous ways. The content of the document, or data items included in the 
document, might be used to characterize it. For example, one might search for a data 
item such as a bank account number to find a bank statement, or one might search for 
a data item such as the name of a company (such as the utility company) to find a bill. 
The shape of the document might be used separately or in conjunction with the 
document content to categorize the document. For example, long, narrow
documents are often receipts, such as a purchase receipt or an ATM receipt.
Alternatively, the user could identify a pattern that may be used for categorizing the 
document. For example, a user could specify what a document is when it is input to the 
system and label the document as belonging to a particular category, such as a bill, a 
statement, etc. The user might also customize the system by training the system to 
detect additional document types. This could be done by programming the system to
accept and identify additional formats (by using layout or template information), text 
information (such as an account number or a merchant name), or images in a 
document (for example a logo). 
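
The categorization approaches described above (a user-supplied label, content data items, and document shape) can be sketched as follows. This is only an illustrative sketch, not the implementation of the described embodiment: the class ScannedDocument, the rule list CONTENT_RULES, the category names, the regular expressions, and the 3:1 aspect-ratio threshold are all assumptions introduced for the example.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class ScannedDocument:
    """Hypothetical container for a loaded document (not from the source)."""
    text: str                         # text recovered from the scan or file
    width_mm: float                   # page width
    height_mm: float                  # page height
    user_label: Optional[str] = None  # category supplied by the user, if any


# Assumed content rules: a category name paired with a pattern for a data item.
CONTENT_RULES = [
    ("bank_statement", re.compile(r"account\s+(number|no\.?)\s*:?\s*\d{6,}", re.I)),
    ("utility_bill", re.compile(r"\b(electric|gas|water)\s+company\b", re.I)),
]


def categorize(doc: ScannedDocument) -> str:
    """Return a category using the user label, then content, then shape."""
    # 1. A label given by the user when the document is input takes priority.
    if doc.user_label:
        return doc.user_label
    # 2. Search the content for characteristic data items.
    for category, pattern in CONTENT_RULES:
        if pattern.search(doc.text):
            return category
    # 3. Fall back on shape: long, narrow documents are often receipts.
    #    The 3:1 ratio is an assumed threshold.
    if doc.width_mm > 0 and doc.height_mm / doc.width_mm >= 3.0:
        return "receipt"
    return "uncategorized"
```

Ordering the checks this way lets an explicit user label override the automatic rules, which matches the idea that the user can customize how documents are categorized.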

An embodiment of the present invention could optionally check to determine 
whether category-specific procedures are available, step 208. If the procedures are 
available on the system, then they could be applied to the document, step 210. If no 
procedures relating to the particular document category are found on the system, then 
a user might optionally train the system to handle a new category, step 212, by 
customizing the system. After training the system to deal with the category, step 212, 
the new procedures might then be applied to the document, step 210. If the 
categorization for the document has changed enough that the procedures do not apply, 
step 214, then the document might be re-categorized in step 206. Otherwise, 
processing continues to step 216, where the document information is extracted.
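
The control flow of steps 208 through 212 can be sketched as a lookup in a table of procedures, with a training callback for unknown categories. This is a minimal sketch under stated assumptions: the function handle_document, the dictionary-based document, and the train callback are hypothetical names, and the re-categorization check of step 214 is omitted for brevity.

```python
from typing import Callable, Dict

# A handling procedure takes a document record and returns an updated record.
Procedure = Callable[[dict], dict]


def handle_document(
    doc: dict,
    procedures: Dict[str, Procedure],
    train: Callable[[str], Procedure],
) -> dict:
    """Apply the category-specific procedure, training a new one if needed."""
    category = doc["category"]
    # Step 208: check whether procedures exist for this category.
    if category not in procedures:
        # Step 212: let the user train the system for the new category.
        procedures[category] = train(category)
    # Step 210: apply the category-specific procedure to the document.
    return procedures[category](doc)
```

Because the trained procedure is stored in the table, later documents of the same category are handled without retraining.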

Category-specific document handling procedures are applied to the document in 
step 210. The category-specific document handling procedures embody knowledge 
about flows of information between the documents entered into the document handling 
system. Documents may be organized by category, by a time component, or by 
transaction. For example, organization by category might include associating credit 
card receipts with credit card bills and checks that are used to pay the credit card bills. 
An example of organization by a time component might include keeping a list of credit 
card statements in order by date. An example of organizing documents by transaction 
might include associating the warranties on purchased items with the receipts from the
sale of the item and/or the credit card bill showing the purchase of the item. Checks 
and ATM receipts and their amounts may be associated with bank statements. One 
way of associating checks with bank account statements is to use the standard line at 
the bottom of a check that includes the account number and the amount that the check
was cashed for. Trade confirmations may be associated with brokerage account 
statements. 
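
As an illustration of associating checks with bank statements, the sketch below matches each check to a statement by the account number read from the line at the bottom of the check. The classes Check and BankStatement and the function link_checks are hypothetical. Matching on the date the check cleared is an added assumption used to pick the right statement period, and the check amount is carried as data but is not used for matching here.

```python
from dataclasses import dataclass, field
from datetime import date
from decimal import Decimal
from typing import List


@dataclass
class Check:
    doc_id: str
    account_number: str  # read from the line at the bottom of the check
    amount: Decimal      # amount the check was cashed for
    cleared_on: date     # assumed to be known; used to pick the statement


@dataclass
class BankStatement:
    doc_id: str
    account_number: str
    period_start: date
    period_end: date
    linked_checks: List[str] = field(default_factory=list)


def link_checks(checks: List[Check], statements: List[BankStatement]) -> List[str]:
    """Link each check to its statement; return the ids of unmatched checks."""
    unmatched = []
    for check in checks:
        for statement in statements:
            same_account = statement.account_number == check.account_number
            in_period = statement.period_start <= check.cleared_on <= statement.period_end
            if same_account and in_period:
                statement.linked_checks.append(check.doc_id)
                break
        else:
            # No statement covers this check yet.
            unmatched.append(check.doc_id)
    return unmatched
```

Unmatched checks are returned rather than discarded, so they can be linked once the corresponding statement is loaded.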

Knowledge about how these documents relate to home activities such as filing
tax returns, making insurance claims, and contesting bills might also be reflected in the
document handling procedures. A set of category-specific handling rules may be
implemented in the document handling system by default, and may be customized to
meet a particular user's needs.

Based on the document category, information is extracted from the document,
step 216. The information that may be extracted from the documents includes, for
example, account numbers, due dates, check numbers, recipient names, etc. that are
associated with the input documents. This information might be extracted by using text
searching techniques, image identification techniques, or by identifying the format of a
particular document. For example, a credit card bill tends to have the same format from
month to month. By taking advantage of the layout of the document, the relevant
information such as the account number, the purchases made, and the balance may be
extracted based on their usual locations in the layout of the document. A template
could be set up to reflect the credit card bill format. This template could be changed
when the credit card company changes the format of the bill. Credit card companies
could even make their bill formats accessible via the Internet so that the user can 
download it into the document system so that the relevant information may be extracted 
accurately. 
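
A template for a credit card bill can be sketched as a table that maps each field to a pattern for locating it. This sketch uses text patterns rather than page coordinates, which is a simplification of the layout-based approach described above. The template CREDIT_CARD_TEMPLATE, the bill wording it expects, and the functions extract_fields and normalize are assumptions made for illustration.

```python
import re
from datetime import datetime
from decimal import Decimal

# Hypothetical template for one issuer's bill format. When the issuer changes
# the format, only this table needs to be replaced.
CREDIT_CARD_TEMPLATE = {
    "account_number": re.compile(r"Account Number:\s*([\d-]+)"),
    "due_date": re.compile(r"Payment Due Date:\s*(\d{2}/\d{2}/\d{4})"),
    "balance": re.compile(r"New Balance:\s*\$([\d,]+\.\d{2})"),
}


def extract_fields(text: str, template: dict) -> dict:
    """Return the raw string for each template field, or None if not found."""
    fields = {}
    for name, pattern in template.items():
        match = pattern.search(text)
        fields[name] = match.group(1) if match else None
    return fields


def normalize(fields: dict) -> dict:
    """Convert the raw strings into a date and a Decimal where present."""
    result = dict(fields)
    if result.get("due_date"):
        result["due_date"] = datetime.strptime(result["due_date"], "%m/%d/%Y").date()
    if result.get("balance"):
        result["balance"] = Decimal(result["balance"].replace(",", ""))
    return result
```

Keeping the template as data separate from the extraction code is what would allow a new format to be downloaded and swapped in without changing the program.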

The information extracted in step 216 might be used to link the document to
other related documents, step 218. Optionally, documents may be backed up and
retained, step 220, after which processing ends, step 222. Using the knowledge
referred to above, an embodiment of the present invention also may be used to guide a
user in setting up and carrying out a retention plan for home documents. Document
retention is a valuable feature because it provides the user the ability to retrieve the
retained document information if the original documents are lost, stolen or destroyed.
The decision to retain and back up a document, step 220, is based on the document's
category, age, and other information that a user might input into the system. Document
retention rules reflect the fact that documents typically lose their usefulness after a long
enough period of time. For example, it is not necessary to keep tax records after
approximately seven years, so a user may wish to dispose of them or not back them up
to off-site storage. Also a user may wish to override and change any default document
retention rules to fulfill a particular need, such as a desire to keep some documents
private by not backing them up to a remote system on the Internet. In order to retrieve
documents that have been backed up, the user simply accesses the remote computer
system 104 and downloads them onto the local computer system 102 through the 
network 106. 
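
The retention decision of step 220 can be sketched as a rule table keyed by category, with user overrides and a privacy flag. Only the seven-year figure for tax records comes from the description above. The other retention periods, the category names, and the names DEFAULT_RETENTION_YEARS, RetentionDecision, and decide_retention are assumptions for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, Optional

# Default retention periods in years. Only the tax record value is taken from
# the text; the other entries are placeholder assumptions.
DEFAULT_RETENTION_YEARS: Dict[str, int] = {
    "tax_record": 7,
    "bank_statement": 3,
    "receipt": 1,
}


@dataclass
class RetentionDecision:
    keep: bool              # retain the document at all
    back_up_remotely: bool  # copy it to the remote computer system


def decide_retention(
    category: str,
    doc_date: date,
    today: date,
    private: bool = False,
    overrides: Optional[Dict[str, int]] = None,
) -> RetentionDecision:
    """Decide whether to keep and back up a document by category and age."""
    # User overrides replace the default rule for the same category.
    rules = {**DEFAULT_RETENTION_YEARS, **(overrides or {})}
    limit_years = rules.get(category)
    age_years = (today - doc_date).days / 365.25
    # Categories without a rule are kept indefinitely in this sketch.
    keep = limit_years is None or age_years <= limit_years
    # Private documents are kept locally but never sent off-site.
    return RetentionDecision(keep=keep, back_up_remotely=keep and not private)
```

Separating the keep decision from the remote backup decision reflects the case in which a user wants to retain a document but keep it private.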
While the embodiments of the present invention described herein have focused
on personal document management featuring specific examples of financial document 
handling and Internet backup capability, other types of documents and backup methods 
could be used without departing from the spirit and scope of the present invention. 
Thus, it should be appreciated that the above description is merely illustrative, and 
should not be read to limit the scope of the invention or the claims. 